Abstract

Background: Laribacter hongkongensis is associated with community-acquired gastroenteritis and traveler's 
diarrhea. In this study, we performed an in-depth annotation of the genes in its genome related to the various 
steps in the infective process, drug resistance and mobile genetic elements. 

Results: For acid and bile resistance, L. hongkongensis possessed a urease gene cassette, two arc gene clusters and
bile salt efflux systems. For intestinal colonization, it possessed a putative adhesin of the autotransporter family
homologous to those of diffusely adherent Escherichia coli (E. coli) and enterotoxigenic E. coli. To evade host
defense, it possessed superoxide dismutase and catalases. For lipopolysaccharide biosynthesis, it possessed the
same set of genes that encode enzymes for synthesizing lipid A, two Kdo units and heptose units as E. coli, but
different genes for its symmetrical acylation pattern, and nine genes for polysaccharide side chain biosynthesis. It
contained a number of CDSs that encode putative cell surface acting cytotoxins (RTX toxin and hemolysins),
intracellular cytotoxins (patatin-like proteins) and enzymes for invasion (outer membrane phospholipase A). It
contained a broad variety of antibiotic resistance-related genes, including genes related to β-lactam (n = 10) and
multidrug efflux (n = 54). It also contained eight prophages, 17 other phage-related CDSs and 26 CDSs for transposases.

Conclusions: The L. hongkongensis genome possessed genes for acid and bile resistance, intestinal mucosa
colonization, evasion of host defense, and cytotoxicity and invasion. A broad variety of antibiotic resistance or
multidrug resistance genes, a high number of prophages, other phage-related CDSs and CDSs for transposases
were also identified.



Background 

In 2001, Laribacter hongkongensis, a novel genus and
species that belongs to the Neisseriaceae family of the
β-subclass of the Proteobacteria, was discovered from the
blood and empyema pus of a patient with underlying
alcoholic cirrhosis [1]. Subsequently, it was observed
that L. hongkongensis was associated with freshwater
fish-borne community-acquired gastroenteritis and
traveler's diarrhea in humans [2-7]. The clinical syndrome
of the associated gastroenteritis is similar to that of
Salmonella or Campylobacter gastroenteritis. About 80% and
20% of the patients have watery and bloody diarrhea,
respectively; one third of them have systemic symptoms
and another one third have vomiting [4]. Pulsed-field
gel electrophoresis of SpeI-digested chromosomal DNA
and multilocus sequence typing using seven housekeeping
gene loci independently showed that the L. hongkongensis
isolates recovered from freshwater fish and patients fell
into separate clusters. These findings suggested that
some L. hongkongensis clones could be more virulent or
better adapted to humans than others [8,9].

For a gastrointestinal tract pathogen to cause infection
after transmission through the oral route, the bacterium
has to survive the hostile acidic environment of the
stomach, resist the action of bile in the small intestine,
colonize the gastrointestinal tract epithelium through
binding of its adhesins to receptors on epithelial cells,
and evade host immune defense mechanisms before causing
diarrhea and/or invading the gastrointestinal tract and
causing systemic infections, as in the case of bacteremia
and empyema thoracis [1]. Moreover, the possession of drug
resistance determinants and phages also enhances the
potential capability of the bacterium to resist killing by
antimicrobials and to cause disease. In this article, we
present an overview of the genes and gene cassettes of the
L. hongkongensis genome related to these various steps in
the infective process, as well as drug resistance and
phages. The phylogeny of these genes, most of which were
thought to be acquired through horizontal gene transfer,
was also analyzed.

Results and discussion 

Resistance to acid 
Urease 

Similar to other gastrointestinal tract pathogens,
L. hongkongensis has to face the highly hostile and acidic
environment of the stomach before reaching the intestine.
L. hongkongensis possesses a urease that is able to
hydrolyze the limited amount of urea available in the
stomach to generate carbon dioxide and ammonia,
which increases the pH. In the L. hongkongensis genome,
a complete urease cassette that occupies a 7,556 bp
region is observed. The cassette includes eight CDSs,
which encode three urease structural proteins (UreA,
UreB and UreC) and five accessory proteins (UreE,
UreF, UreG, UreD and UreI) [10]. Similar to the urease
of other bacteria, the urease of L. hongkongensis is
presumably a nickel-containing enzyme [11]. The histidine
residues at the carboxyl terminal of UreE are supposed
to bind the nickel ions that are transported into
L. hongkongensis through a nickel transporter, and donate
the nickel ions to UreC during urease activation. Most
of the eight genes in the urease cassette of
L. hongkongensis are most closely related to their
homologues in bacteria of the α- and γ-proteobacteria,
rather than those in other bacteria of the
β-proteobacteria [12-16].
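The acid-neutralizing effect described above can be written out as a short worked equation. This is a textbook summary of the net urease reaction rather than something taken from the genome annotation itself; the second line is added only to show why the released ammonia raises the pH.

```latex
\begin{align*}
\mathrm{CO(NH_2)_2 + H_2O} &\longrightarrow \mathrm{CO_2 + 2\,NH_3} \\
\mathrm{NH_3 + H^+} &\rightleftharpoons \mathrm{NH_4^+}
\end{align*}
```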
Arginine deiminase 

Two arc gene clusters are encoded in the L. hongkongensis
genome. Each cluster consists of four genes, arcA,
arcB, arcC and arcD. arcA, arcB and arcC encode the
three enzymes of the arginine deiminase pathway, namely
arginine deiminase, ornithine carbamoyltransferase and
carbamate kinase, whereas arcD encodes a membrane-bound
arginine-ornithine antiporter. The arginine deiminase
pathway converts L-arginine to carbon dioxide,
ATP and ammonia, which increases the pH. It has been
shown in various bacteria, such as Streptococcus sanguis,
Streptococcus rattus, Streptococcus suis, Streptococcus
pyogenes, Enterococcus faecium and Pseudomonas
aeruginosa, that this gene cluster is useful for bacterial
survival in acidic environments [17-19]. In S. pyogenes,
it has also been shown that this pathway facilitates cell
invasion and inhibits proliferation of human peripheral
blood mononuclear cells [20,21]. Phylogenetically, the
four genes of the arc gene cluster in L. hongkongensis
are most closely related to the corresponding homologues
in Chromobacterium violaceum (Figures 1, 2, 3 and 4),
whereas the gene cluster is absent in Neisseria
meningitidis and Neisseria gonorrhoeae. Among all
bacteria with complete genomes sequenced,
L. hongkongensis is the only one that contains two
adjacent arc gene clusters (Figure 5).
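The three enzymatic steps of the arginine deiminase pathway can be summarized as follows. This is a textbook sketch of the pathway stoichiometry, with the enzyme assignments (ArcA, ArcB, ArcC) taken from the text above; the carbamate kinase step is written in its usual simplified net form.

```latex
\begin{align*}
\text{ArcA:}\quad & \text{L-arginine} + \mathrm{H_2O} \longrightarrow \text{L-citrulline} + \mathrm{NH_3} \\
\text{ArcB:}\quad & \text{L-citrulline} + \mathrm{P_i} \longrightarrow \text{L-ornithine} + \text{carbamoyl phosphate} \\
\text{ArcC:}\quad & \text{carbamoyl phosphate} + \mathrm{ADP} \longrightarrow \mathrm{ATP} + \mathrm{NH_3} + \mathrm{CO_2} \\
\text{Net:}\quad & \text{L-arginine} + \mathrm{H_2O} + \mathrm{P_i} + \mathrm{ADP} \longrightarrow \text{L-ornithine} + 2\,\mathrm{NH_3} + \mathrm{CO_2} + \mathrm{ATP}
\end{align*}
```

The ornithine produced is exchanged for external arginine by the ArcD antiporter, so the pathway can run without a separate energy cost for substrate uptake.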

Bile resistance 
Efflux pumps 

Efflux of bile salts from bacteria is mediated through a
number of efflux systems. These efflux systems pump a
variety of compounds, including antibiotics, oxidative
stress agents, organic solvents and bile salts, out of the
bacterial cytoplasm. Among these efflux systems, the
best studied one is encoded by acrAB-tolC of the
resistance nodulation division (RND) family. This system
has been shown to be present in the genomes of a variety of
pathogenic and non-pathogenic bacteria of the human
gastrointestinal tract, such as Escherichia coli and
Salmonella Typhimurium [22,23]. In the L. hongkongensis
genome, three complete copies of acrAB-tolC are present.
AcrB is located in the inner membrane and contains the
conserved ACR_tran domain, AcrA is located in the
periplasmic space and contains the conserved HlyD
domain, and TolC is the outer membrane channel protein.
A recent bioinformatics analysis of bile resistance
mechanisms in Campylobacterales also found that one
complete copy of acrAB-tolC is present in the C. jejuni
genome [24]. In addition to efflux pumps encoded by
acrAB-tolC, the genome of L. hongkongensis also contains
two copies of emrAB-tolC of the major facilitator
superfamily, one copy of acrAD-tolC of the RND family
(AcrD is also an inner membrane protein and contains the
conserved ACR_tran domain similar to AcrB), one copy of
mdtABC-tolC of the RND family and one copy of
ydgFE/mdtJI of the small multidrug resistance family.
These four gene cassettes were also found to encode
efflux pumps related to bile resistance in E. coli
[22,25-27]. In addition, acrAD-tolC and mdtABC-tolC have
been documented to be related to bile salt resistance in
S. Typhimurium [28].

Lipopolysaccharide (LPS) and Tol proteins 

In addition to the efflux pumps, the integrity of the
outer membrane is also important in resistance against
bile. The O-antigen has been shown to be related to bile
resistance in S. Typhimurium [29,30]. Tol proteins,
which are cytoplasmic and periplasmic proteins encoded
by a gene cluster that consists of five genes (tolQ, tolR,
tolA, tolB and pal), are also important in maintaining
the integrity of the outer membrane and in bile resistance,
as shown in E. coli, S. Typhimurium and Erwinia
chrysanthemi [31-33]. In the genomes of L. hongkongensis
and C. violaceum, tolQ is not clustered with tolR,
tolA, tolB and pal, although all five genes are present in
their genomes.

Figure 1 Phylogenetic analysis of ArcA encoded in the arc gene cluster in L. hongkongensis. The tree was constructed by the neighbor
joining method using Kimura's correction and bootstrap values calculated from 1000 trees. Four hundred and nine amino acid
positions in each of ArcA1 and ArcA2 were included in the analysis. The scale bars indicate the estimated number of substitutions per 10
amino acids. All names and accession numbers are given as cited in the GenBank database.

Figure 2 Phylogenetic analysis of ArcB proteins encoded in the arc gene cluster in L. hongkongensis. The tree was constructed by the
neighbor joining method using Kimura's correction and bootstrap values calculated from 1000 trees. Three hundred and thirty-four and 335
amino acid positions in ArcB1 and ArcB2, respectively, were included in the analysis. The scale bar indicates the estimated number of
substitutions per 20 amino acids. All names and accession numbers are given as cited in the GenBank database.

Figure 3 Phylogenetic analysis of ArcC proteins encoded in the arc gene cluster in L. hongkongensis. The tree was constructed by the
neighbor joining method using Kimura's correction and bootstrap values calculated from 1000 trees. Two hundred and ninety-one and 314
amino acid positions in ArcC1 and ArcC2, respectively, were included in the analysis. The scale bars indicate the estimated number of
substitutions per 10 amino acids. All names and accession numbers are given as cited in the GenBank database.

Figure 4 Phylogenetic analysis of ArcD encoded in the arc gene cluster in L. hongkongensis. The tree was constructed by the neighbor
joining method using Kimura's correction and bootstrap values calculated from 1000 trees. Four hundred and ninety-two, 478 and 478 amino
acid positions in ArcD1, ArcD2 and ArcD3, respectively, were included in the analysis. The scale bars indicate the estimated number of
substitutions per 10 amino acids. All names and accession numbers are given as cited in the GenBank database.

Figure 5 Genetic organization of ADI clusters in L. hongkongensis and other representative microbial genomes. The arrow boxes
represent the CDSs. The relative positions of each gene are assigned as predicted by nucleotide sequence analysis.
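The figure captions for the arc gene trees mention neighbor joining "using Kimura's correction". As a minimal sketch, and assuming this refers to Kimura's empirical correction for protein distances as implemented in common alignment software (the source does not state the exact formula or program), the corrected distance between two aligned sequences could be computed as below. The function names and the toy alignment are hypothetical and are not real ArcA sequences.

```python
import math


def p_distance(aligned_a: str, aligned_b: str) -> float:
    """Fraction of differing residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differing = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gapped columns are ignored
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


def kimura_protein_distance(p: float) -> float:
    """Kimura's empirical correction (assumed here): d = -ln(1 - p - p^2 / 5)."""
    inner = 1.0 - p - 0.2 * p * p
    if inner <= 0:
        raise ValueError("p is too large for this approximation")
    return -math.log(inner)


# Hypothetical toy alignment, for illustration only.
seq_1 = "MKT-LVAGGSR"
seq_2 = "MKSALVAG-SK"
p = p_distance(seq_1, seq_2)
d = kimura_protein_distance(p)
print(f"identity = {1 - p:.3f}, p-distance = {p:.3f}, corrected distance = {d:.3f}")
```

The same `p_distance` value, subtracted from 1, gives a simple percentage identity of the kind quoted for protein comparisons, although published identities depend on how the alignment was made and how gaps are counted.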

Colonization of intestinal mucosa 

The first step of infection is adhesion to host cells. In
the L. hongkongensis genome, a putative adhesin with
27-30% amino acid identity to the adhesins of diffusely
adherent E. coli (DAEC) [34-36] and enterotoxigenic
E. coli (ETEC) [37-40], encoded by aidA and tibA
respectively, was observed (Figure 6). It has been shown
that aidA deletion mutants of DAEC lost the ability to
adhere to HeLa cells and tibA deletion mutants of
ETEC lost the ability to adhere to human intestinal
epithelial cells [37,41,42], and that E. coli HB101
transformed with the tib loci was able to adhere to HCT 8
cells [37,42]. aidA and tibA encode proteins of the
autotransporter family, the type V protein secretion
system of Gram-negative bacteria [43]. Proteins of this
family possess three domains, an N-terminal signal
sequence, a passenger or α-domain and a translocation or
β-domain, which enable the proteins to transport
themselves to cell surfaces. These three domains are all
present in the putative adhesin in L. hongkongensis.
Amino acid residues 1-36 are the putative signal sequence
(predicted by SignalP). As in the passenger domains of
other autotransporters, no cysteine residues, which are
thought to interfere with transport of the proteins to
cell surfaces because of formation of disulphide bonds,
were present in the putative passenger domain (amino acid
residues 37-756) of the putative adhesin in
L. hongkongensis [41]. In the passenger domain of AIDA in
DAEC, multiple copies of the consensus sequence VXNSGG
are present; these are acceptor sites for heptose, the
addition of which is catalyzed by the AAH
heptosyltransferase, encoded by aah located upstream of
aidA [44]. The addition of heptose was shown to
be essential for the adhesion properties of the tibA
adhesin in ETEC [45]. In the putative passenger domain
of the putative adhesin in L. hongkongensis, nine copies
of VXSGG, but not VXNSGG, were present; and a putative
heptosyltransferase, with 52% amino acid identity to
the TibC heptosyltransferase of ETEC, was present
upstream of the putative adhesin gene in the
L. hongkongensis genome. Interestingly, in the putative
passenger domain of the tibA adhesin in ETEC, 11 copies of
VXSGG, but not VXNSGG, were present, but whether
VXSGG is an acceptor site for heptose has not been
documented. In addition to their roles in adhesion, the
passenger domains may also possess virulence functions,
such as autoaggregation, biofilm formation, invasion and
cytotoxicity. In the putative translocation domain, the
consensus motif (Y/V/I/F/W)-X-(F/W) found at the extreme
carboxyl terminus of other autotransporter proteins,
predicted to play a role in outer membrane localization
and/or stability of these proteins, was present [41].
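The motif checks described above (counting VXSGG and VXNSGG copies in a passenger domain, and testing for the carboxyl-terminal (Y/V/I/F/W)-X-(F/W) motif) can be illustrated with regular expressions. This is only a sketch of the idea and not necessarily how the authors performed the search; the helper names and the toy sequence are hypothetical and do not come from the L. hongkongensis adhesin.

```python
import re

# Consensus motifs quoted in the text; "X" (any residue) becomes "." in a regex.
VXSGG = re.compile(r"V.SGG")
VXNSGG = re.compile(r"V.NSGG")
# (Y/V/I/F/W)-X-(F/W), anchored at the extreme carboxyl terminus.
C_TERMINAL_MOTIF = re.compile(r"[YVIFW].[FW]$")


def count_motif(pattern: re.Pattern, sequence: str) -> int:
    """Count non-overlapping occurrences of a motif in a protein sequence."""
    return len(pattern.findall(sequence))


def has_c_terminal_motif(sequence: str) -> bool:
    """True if the last three residues fit (Y/V/I/F/W)-X-(F/W)."""
    return C_TERMINAL_MOTIF.search(sequence) is not None


# Hypothetical toy sequence, for illustration only.
toy_protein = "MKVASGGTTVLSGGAAVQNSGGKKYAF"

print("VXSGG copies: ", count_motif(VXSGG, toy_protein))
print("VXNSGG copies:", count_motif(VXNSGG, toy_protein))
print("C-terminal motif present:", has_c_terminal_motif(toy_protein))
```

Because neither motif can overlap with another copy of itself, a plain non-overlapping search is sufficient here.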

Evasion of host defense 

To protect against the active oxygen species (superoxide
and hydrogen peroxide) released from phagocytic cells,
the genome of L. hongkongensis encodes superoxide
dismutase and catalases, in line with its catalase-positive
phenotype. The putative superoxide dismutase of
L. hongkongensis, which decomposes superoxide to hydrogen
peroxide and oxygen, is most closely related to
those of C. violaceum, N. meningitidis and
N. gonorrhoeae. There are three putative catalases in the
L. hongkongensis genome, encoded by a katE (encoding
hydroperoxidase II) and two katG (encoding
hydroperoxidase I with catalase-peroxidase activity).
These decompose hydrogen peroxide to water and oxygen.
katE in L. hongkongensis is most closely related to
the homologues in Ralstonia eutropha, whereas the two
katG are most closely related to those in Shewanella
amazonensis and Vibrio cholerae respectively. In
addition to protection against the active oxygen species,
some efflux pumps may export host-derived antimicrobial
agents in addition to antibiotics, bile and other
substances, hence protecting the bacterium from such
naturally produced molecules of the host.

Figure 6 Phylogenetic analysis of the putative adhesin of L. hongkongensis. The tree was constructed by the neighbor joining method using
Kimura's correction and bootstrap values calculated from 1000 trees. Six hundred and eight amino acid positions of the passenger domain were
included in the analysis. The scale bar indicates the estimated number of substitutions per 20 amino acids. All names and accession numbers are
given as cited in the GenBank database.
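The two detoxification steps attributed above to superoxide dismutase and to the catalases can be written as balanced reactions. These are the standard textbook forms and are not specific to the L. hongkongensis enzymes.

```latex
\begin{align*}
\text{Superoxide dismutase:}\quad & 2\,\mathrm{O_2^{\bullet -}} + 2\,\mathrm{H^+} \longrightarrow \mathrm{H_2O_2} + \mathrm{O_2} \\
\text{Catalase:}\quad & 2\,\mathrm{H_2O_2} \longrightarrow 2\,\mathrm{H_2O} + \mathrm{O_2}
\end{align*}
```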

Virulence factors 
Lipopolysaccharide 

LPS consists of three parts: lipid A, core oligosaccharide,
and polysaccharide side chains. In E. coli, the minimal
LPS required for growth includes lipid A and two
keto-deoxyoctulonate (Kdo) units of the core
oligosaccharide. The LPS of wild type strains of E. coli
consists of additional core sugars and polysaccharide side
chains. The polysaccharide side chains are also known as
the O-antigen, which varies among different species of
Gram-negative bacteria and different strains of the same
species. These sugars enhance survival during
environmental stress, and help the bacteria evade the host
immune system by modification of the structure. Lipid A,
also known as the endotoxin, is the hydrophobic anchor of
LPS. It is a glucosamine-based phospholipid inserted
into the outer membranes of most Gram-negative bacteria.
Most Gram-negative bacteria synthesize lipid A by
pathways similar to the one in E. coli. Through binding
to Toll-like receptor 4 and CD14, lipid A of Gram-negative
bacteria triggers the synthesis and secretion of
pro-inflammatory cytokines. The actions of these cytokines
lead to local and systemic inflammatory responses,
which result in various clinical manifestations, and even
deaths, of patients.

The same set of genes that encode enzymes in the
biosynthetic pathways of lipid A, the two Kdo units and
the heptose units is present in the L. hongkongensis,
C. violaceum, N. meningitidis, N. gonorrhoeae and E. coli
genomes. In contrast to E. coli, the lipid A of
C. violaceum, N. meningitidis and N. gonorrhoeae has a
symmetrical acylation pattern [46]. Both the reducing and
terminal N-acetyl-glucosamine residues in these bacteria
carry three acyl groups. The sequential addition of the
last 12-carbon acyl group to the reducing and terminal
N-acetyl-glucosamine residues is catalyzed by enzymes
encoded by the htrB and msbB genes, respectively. It
was found that msbB deletion mutants of N. meningitidis
and N. gonorrhoeae had lower abilities to activate
human macrophages to produce pro-inflammatory
cytokines [47-49]. Phylogenetic analysis of the
experimentally confirmed htrB and msbB genes in
N. meningitidis and N. gonorrhoeae and the putative htrB
and msbB genes in L. hongkongensis and C. violaceum
showed that the four htrB genes and the four msbB
genes fell into two separate clusters, with very high
bootstrap values (Figure 7). Therefore, we speculate that
the htrB and msbB genes in L. hongkongensis and
C. violaceum serve similar functions as those in
N. meningitidis and N. gonorrhoeae and that the lipid A of
L. hongkongensis also has a symmetrical acylation pattern.

The genes that are responsible for the synthesis of
α-chain L1, α-chain L2, β-chain and γ-chain in the core
oligosaccharide in N. meningitidis and N. gonorrhoeae
(lgtA, lgtB, lgtC, lgtD, lgtE, lgtF, lgtG, rfaK) and those
for the addition of sialic acids to these chains (lst) are
absent in the genomes of L. hongkongensis and
C. violaceum [50]. On the other hand, nine genes which
encode putative enzymes for biosynthesis of the
polysaccharide side chains are present in the
L. hongkongensis genome. Four of these genes (rfbA, rfbB,
rfbC and rfbD) are also present in the genomes of
C. violaceum, N. meningitidis and N. gonorrhoeae. The
enzymes encoded by these four genes catalyze reactions
for the synthesis of dTDP-rhamnose, although
mutations of them in N. meningitidis and N. gonorrhoeae
did not result in any change in their phenotypes
[51,52]. The other five genes (wbmF, wbmG,
wbmH, wbmI and wbmK), which encode putative
nucleotide sugar epimerases/dehydratases and
amidotransferase, are not present in the C. violaceum,
N. meningitidis and N. gonorrhoeae genomes, but are
most closely related to the corresponding genes for the
biosynthesis of the O-antigens in Bordetella
parapertussis and Bordetella bronchiseptica [53]. Although
the structures of the LPS of L. hongkongensis and
C. violaceum remain to be determined, these findings imply
that the structures of the LPS of L. hongkongensis and
C. violaceum are probably quite different from those of
the lipooligosaccharides of N. meningitidis and
N. gonorrhoeae.

Recently, a number of genes that encode proteins for
the assembly and transport of LPS in E. coli have been
discovered [54]. All these genes are also present in the
genomes of L. hongkongensis, C. violaceum,
N. meningitidis and N. gonorrhoeae (Table 1). The exact
functions of these proteins have not been fully elucidated.

Figure 7 Phylogenetic analysis of confirmed/putative HtrB and MsbB of L. hongkongensis, C. violaceum, N. meningitidis and N.
gonorrhoeae. The tree was constructed by the neighbor joining method using Kimura's correction and was rooted using HtrB of Rickettsia typhi
(YP_067645). Two hundred and eighty-four amino acid positions were included in the analysis. The scale bar indicates the estimated number of
substitutions per 20 amino acids. Numbers at nodes indicate levels of bootstrap support calculated from 1000 trees. All names and accession
numbers are given as cited in the GenBank database.

Table 1 Genes for assembly and transport of LPS

Gene  E. coli str. K-12   L. hongkongensis  C. violaceum  N. gonorrhoeae  N. meningitidis  Protein
      substr. MG1655                                      FA1090          MC58
lptA  b3200               LHK_02023         CV3330        NGO1606         NMB0355          Periplasmic LPS-binding protein
lptC  b3199               LHK_02022         CV3329        NGO1607         NMB0354          Lipopolysaccharide export, IM-tethered periplasmic protein of LptBFGC export complex
lptB  b3201               LHK_02024         CV3331        NGO1605         NMB0356          Lipopolysaccharide export ABC transporter ATP-binding protein of LptBFGC export complex
lptF  b4261               LHK_01413         CV2915        NGO1228         NMB1570          Lipopolysaccharide export ABC permease of LptBFGC export complex
lptG  b4262               LHK_01412         CV2916        NGO1229         NMB1571          Lipopolysaccharide export ABC permease of LptBFGC export complex
lptD  b0054               LHK_03193         CV4229        NGO1715         NMB0280          LPS assembly OM complex LptDE, beta-barrel component
lptE  b0641               LHK_00118         CV0506        NGO0282         NMB0707          LPS assembly OM complex LptDE, lipoprotein component
Cytotoxins 

The L. hongkongensis genome contains a number of 
CDSs that encode putative cytotoxins. These include 
cell surface acting cytotoxins, such as RTX toxin and 
hemolysins; and intracellular cytotoxins such as patatin- 
like proteins. 

RTX toxins RTX toxins, originally discovered in E. coli
(α-hemolysin) [55,56], are most commonly found in
bacteria of the Pasteurellaceae family. Most RTX toxins are
hemolysins or leukotoxins [57,58]. The L. hongkongensis
genome contains an RTX gene cluster
(tolC-rtxA1-rtxD-rtxB) and an isolated rtxA2 gene. In the
RTX gene cluster (Figure 8), tolC encodes the outer
membrane component of the type I secretion apparatus,
rtxA1 encodes the structural toxin, rtxD encodes the
adaptor protein anchored to the inner membrane and rtxB
encodes the inner membrane ATPase. TolC, RtxD and RtxB
form the secretion apparatus for exporting RtxA. Similar
to RtxA of other bacteria, RtxA1 and RtxA2 of
L. hongkongensis possess tandem arrays of glycine-rich
nonapeptide repeats (GGXGXDX[L/I/V/W/Y/F]X, where X is
any amino acid) for binding of calcium ions (Figure 8).
There are five nonapeptide repeats in RtxA1 and nine
nonapeptide repeats in RtxA2. Unlike most other bacteria,
whose RTX gene clusters contain rtxC genes, the RTX gene
cluster of L. hongkongensis does not possess this gene.
Instead, it contains a gene of putative adhesive function,
located between rtxA1 and rtxD. Domain search using
InterProScan showed that this gene contains nine repeats
of 22 amino acids (TDNGTVTNVTLSSVTNGQTVAE) with parallel
beta-helix structures. Each repeat is separated from the
adjacent one by 82 amino acids (Figure 8). Although the
genomes of L. hongkongensis, C. violaceum and
N. meningitidis all contain RTX toxins, RtxA1 and RtxA2 of
L. hongkongensis do not cluster with the homologues in
C. violaceum and N. meningitidis. This is in contrast to
the other genes (tolC, rtxD and rtxB) in the RTX gene
cluster, which are all most closely related to the
corresponding homologues in C. violaceum and other species
of β-proteobacteria [59,60] (Figures 9, 10, 11 and 12).
Moreover, the amino acid identities between TolC, RtxD and
RtxB and their homologues in C. violaceum are much higher
than those between RtxA1 or RtxA2 and their homologues
in any other bacteria (Figures 9, 10, 11 and 12). These
observations suggest that rtxA1 and rtxA2 have evolved
much faster than tolC, rtxD and rtxB, so that the toxins
can bind to their corresponding host cells more
efficiently. Interestingly, similar to rtxA2 of
L. hongkongensis, the structural toxin genes (frpC and
frpA) in N. meningitidis are not linked to genes of the
type I secretion system. However, it has been shown that
FrpC and FrpA can be secreted by E. coli harboring hlyBD
genes, indicating that they are probably secreted by
secretion systems unlinked to their corresponding
genes [61].

Figure 8 Genetic organization of the RTX gene cluster (tolC-rtxA1-rtxD-rtxB) in L. hongkongensis. The boxes represent the CDSs. The
number of amino acid residues encoded by each gene is indicated above the boxes. The basic functional activities of the corresponding gene products
are given on the top. Five copies of glycine-rich nonapeptide repeats (GGXGXDX[L/I/V/W/Y/F]X, where X is any amino acid) of rtxA1 are
underlined. A CDS of unknown function, located between rtxA1 and rtxD, is also depicted, where nine repeats of 22 amino acids are
highlighted. The relative positions of each gene are assigned as predicted by nucleotide sequence analysis.
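The glycine-rich nonapeptide consensus quoted above, GGXGXDX[L/I/V/W/Y/F]X, translates directly into a regular expression. The sketch below locates repeats that fit the strict consensus and reports the longest run of back-to-back copies; it is an illustration under that assumption, not the authors' procedure, and real RTX repeats can deviate from the consensus. The function names and the toy sequence are hypothetical.

```python
import re

# GGXGXDX[L/I/V/W/Y/F]X, with X (any residue) written as "." in the regex.
RTX_NONAPEPTIDE = re.compile(r"GG.G.D.[LIVWYF].")


def find_repeats(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched nonapeptide) for each repeat."""
    return [(m.start() + 1, m.group()) for m in RTX_NONAPEPTIDE.finditer(sequence)]


def longest_tandem_run(sequence: str) -> int:
    """Largest number of repeats that follow one another with no gap between them."""
    best = 0
    run = 0
    previous_end = None
    for match in RTX_NONAPEPTIDE.finditer(sequence):
        # A repeat extends the current run only if it starts where the last one ended.
        run = run + 1 if match.start() == previous_end else 1
        best = max(best, run)
        previous_end = match.end()
    return best


# Hypothetical toy sequence: two tandem repeats, a spacer, then a third repeat.
toy_toxin = "MS" + "GGAGNDTLH" + "GGAGNDKLY" + "AAA" + "GGEGDDRLF"

for position, repeat in find_repeats(toy_toxin):
    print(position, repeat)
print("longest tandem run:", longest_tandem_run(toy_toxin))
```

A stricter or looser pattern may be preferable for real data, since a purely consensus-based search will miss degenerate copies of the repeat.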

Hemolysins In the L. hongkongensis genome, there are
two gene loci that encode putative hemolysins. The first
putative hemolysin contains three domains, the first one
of the DUF21 superfamily, the second one of the
CBS_pair superfamily and the third one of the
CorC_HlyC superfamily. Among the five most closely
related protein sequences, three were putative
hemolysins of three different Yersinia species, and the
other two were hypothetical proteins. The second
putative hemolysin belongs to the HlyIII superfamily, which
contains seven transmembrane domains with conserved
amino acid residues present. It is most closely related to
the hemolysin III of C. violaceum.

Patatin-like protein Patatin, originally described in
plants such as potatoes, has diverse functions such as
storage glycoprotein [62], signal transduction [63] and
defense against parasites [64]. In 2003, it was found that
the toxin ExoU of P. aeruginosa, delivered to eukaryotic
cells via a type III secretion system, possessed the
catalytic domains of patatin, iPLA(2) and cPLA(2) [65].
Direct injection of ExoU into mammalian cells resulted in
irreversible damage to cellular membranes and rapid
necrotic death [66]. Similar to patatin, ExoU of
P. aeruginosa possessed phospholipase A2 activity.
P. aeruginosa mutants with mutations at the active sites
of the patatin-like protein were less virulent than wild
type P. aeruginosa in a mouse model [67]. Subsequently,
genes that



Figure 9 Phylogenetic analysis of TolC in the RTX gene cluster of L. hongkongensis. The tree was constructed by the neighbor joining method
using Kimura's correction and bootstrap values calculated from 1000 trees. Four hundred and forty-two amino acid positions were included in
the analysis. The scale bars indicate the estimated number of substitutions per 20 amino acids. All names and accession numbers are given as
cited in the GenBank database.



Figure 10 Phylogenetic analysis of RtxA1 in the RTX gene cluster of L. hongkongensis. The tree was constructed by the neighbor joining
method using Kimura's correction and bootstrap values calculated from 1000 trees. One thousand and eighty-seven amino acid positions were
included in the analysis. The scale bars indicate the estimated number of substitutions per 20 amino acids. All names and accession numbers are
given as cited in the GenBank database.



Figure 11 Phylogenetic analysis of RtxD in the RTX gene cluster of L. hongkongensis. The tree was constructed by the neighbor joining
method using Kimura's correction and bootstrap values calculated from 1000 trees. Four hundred and fifty-two amino acid positions were
included in the analysis. The scale bars indicate the estimated number of substitutions per 20 amino acids. All names and accession numbers are
given as cited in the GenBank database.

Figure 12 Phylogenetic analysis of RtxB in the RTX gene cluster of L. hongkongensis. The tree was constructed by the neighbor joining
method using Kimura's correction and bootstrap values calculated from 1000 trees. Seven hundred and twenty amino acid positions were
included in the analysis. The scale bars indicate the estimated number of substitutions per 20 amino acids. All names and accession numbers are
given as cited in the GenBank database.



encode putative patatin-like proteins were observed in many bacterial
genomes, although none of them was characterized phenotypically [68].
It was also observed that the average copy number of genes that
encode patatin-like proteins is higher in plant/animal bacterial
pathogens than in non-pathogens [68]. In some pathogens, up to eight
copies of genes that encode putative patatin-like proteins can be
found. Similar to P. aeruginosa, the genome of L. hongkongensis also
contains three copies of genes that encode putative patatin-like
proteins. The lengths of the genes that encode putative patatin-like
proteins in the genomes of L. hongkongensis, C. violaceum (7 copies),
N. meningitidis (1 copy) and N. gonorrhoeae (1 copy) vary from 894 to
2,337 bp. The three copies in the L. hongkongensis genome are 951,
963 and 2,232 bp respectively. All three copies contain all four
domains found in bacterial patatin-like proteins: a putative oxyanion
hole, a serine hydrolase G-X-S-X-G domain, a potential
serine-containing phosphorylation site and an aspartate-containing
active site domain (Figure 13). The serine in the hydrolase domain
and the aspartate make up a patatin-specific catalytic dyad that has
not been described in any other known proteins [68].
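
The G-X-S-X-G serine hydrolase motif described above can be checked with a simple pattern search. The sketch below is illustrative only: the Block II segments are transcribed from Figure 13 (with extraction spacing removed, which is an assumption about the original figure), and the helper name is hypothetical.

```python
import re

# G-X-S-X-G, where X is any residue; the central serine is the catalytic one.
GXSXG = re.compile(r"G.S.G")

# Block II segments transcribed from Figure 13 (ExoU and the three
# L. hongkongensis patatin-like proteins).
BLOCK_II = {
    "ExoU (AAC16023)": "GSSAGGITAALLASGMSPAAF",
    "LHK_00116": "GTSAGAQTLSAYACGQHGYSR",
    "LHK_01938": "GASAGALIGGFYAAGRPVAEL",
    "LHK_03113": "GTSAGSFVGALYAAGYSGFQL",
}


def find_hydrolase_motif(segment: str):
    """Return (offset, matched text) of the first G-X-S-X-G hit, or None."""
    hit = GXSXG.search(segment)
    return (hit.start(), hit.group()) if hit else None


for name, segment in BLOCK_II.items():
    print(name, find_hydrolase_motif(segment))
```

A hit only shows that the sequence pattern is present; it says nothing about whether the protein is catalytically active.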
Enzymes

Outer membrane phospholipase A It has been shown that outer membrane
phospholipase A (OMPLA) is a virulence factor in a number of
bacteria, including Helicobacter pylori and C. coli. Located on the
outer membrane of bacteria, OMPLA lyses the outer membrane, leading
to release of other virulence factors, such as urease and VacA in
H. pylori. In the L. hongkongensis genome, a gene that encodes a
putative OMPLA is observed. This OMPLA possesses a complete and
highly specific consensus sequence motif (YTQ-Xn-G-X2-H-X-SNG) found
in OMPLA of other bacteria. Phylogenetically, it is most closely
related to the OMPLA of Methylibium petroleiphilum, a methyl
tert-butyl ether-degrading methylotroph of β-proteobacteria
(Figure 14) [69].
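
The OMPLA consensus motif YTQ-Xn-G-X2-H-X-SNG translates directly into a regular expression. The following is a minimal sketch under stated assumptions: Xn is taken to mean one or more arbitrary residues, and the test sequence is a made-up toy string, not the L. hongkongensis protein.

```python
import re

# YTQ, a variable-length spacer (assumed n >= 1), G, two residues, H,
# one residue, then SNG. Uppercase one-letter amino acid codes are assumed.
OMPLA_MOTIF = re.compile(r"YTQ[A-Z]+?G[A-Z]{2}H[A-Z]SNG")


def find_ompla_motif(protein: str):
    """Return (start, end, matched text) of the first motif hit, or None."""
    hit = OMPLA_MOTIF.search(protein.upper())
    return (hit.start(), hit.end(), hit.group()) if hit else None


# Hypothetical toy sequence used only to exercise the pattern.
toy_sequence = "MKLYTQAAAGLLHKSNGRR"
print(find_ompla_motif(toy_sequence))
```

The lazy quantifier keeps the spacer as short as possible, so the reported hit is the tightest match starting at the first YTQ that can be completed.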

Drug resistance

A genome-wide analysis using similarity searches revealed the
presence of a large number of antibiotic resistance-related genes in
L. hongkongensis strain HLHK9. They are related to β-lactam
resistance (Table 2), multidrug efflux (Table 3) and other resistance
mechanisms (Table 4).
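
Tables 2 to 4 report each similarity hit as identical residues over aligned length, followed by a percentage. As a small worked check, the sketch below recomputes that percentage. The reported values appear to be truncated rather than rounded (for example, 126/259 is listed as 48%); that convention is inferred from the tables and is an assumption, as is the helper name.

```python
def percent_identity(identities: str) -> int:
    """Convert an 'identical/aligned' string such as '135/204' to a percentage.

    Integer division truncates toward zero, matching the values listed
    in the tables (an inferred convention, not a documented one).
    """
    matches, aligned = (int(part) for part in identities.replace(" ", "").split("/"))
    if aligned <= 0 or matches > aligned:
        raise ValueError(f"implausible identities value: {identities!r}")
    return 100 * matches // aligned


# Two entries from Table 2, then the two Table 3 entries that are
# listed without a percentage.
for value in ("135/204", "126/259", "700/1018", "678/994"):
    print(value, f"{percent_identity(value)}%")
```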
β-lactam resistance-related genes

A total of 10 CDSs related to β-lactam resistance were identified in
the L. hongkongensis genome. Genes that exhibit similarity to
penicillin-binding proteins (PBPs) (6 CDSs) of other bacterial
species were found (Table 2). The PBPs identified in L. hongkongensis
include PBP1a, PBP2, PBP3, PBP4a, PBP6a and PBP7, which are essential
proteins involved in the biosynthesis of murein and peptidoglycan and
are targets for inhibition by β-lactams [70,71]. Although the
presence of PBPs per se


No. | Organism | Name of PLP and function | Accession no. | co | Block I (oxyanion hole)
1 | Pseudomonas aeruginosa PA103 | ExoU, phospholipase A | AAC16023 | 109 | SGGGAKGAAYPGAMLA
2 | Chromobacterium violaceum ATCC 12472 | latex allergen from Hevea brasiliensis | NP_900111 | 12 | DGGGLRGIIALVILDR
3 | Chromobacterium violaceum ATCC 12472 | hypothetical protein | NP_900364 | 35 | QGGGALGAYQAGVYQG
4 | Chromobacterium violaceum ATCC 12472 | hypothetical protein | NP_900401 | 9 | GGGGSKGIVYPQAIEV
5 | Chromobacterium violaceum ATCC 12472 | conserved hypothetical protein | NP_902385 | 78 | GGGGARGFAHLGVLKE
6 | Chromobacterium violaceum ATCC 12472 | hypothetical protein | NP_902513 | 15 | QGGGAHGAFTWGVLDA
7 | Chromobacterium violaceum ATCC 12472 | hypothetical protein | NP_902864 | 11 | SGGGARAAYQVGVLLG
8 | Chromobacterium violaceum ATCC 12472 | probable transmembrane protein | NP_904044 | 46 | GGGAVKGFAHIGVIKV
9 | Laribacter hongkongensis HLHK9 | hypothetical protein | LHK_00116 | 29 | EGGGQRGIFTAGVLDA
10 | Laribacter hongkongensis HLHK9 | hypothetical protein | LHK_01938 | 38 | GGGGARGFAHLGVLEE
11 | Laribacter hongkongensis HLHK9 | probable transmembrane protein | LHK_03113 | 57 | GGGAAKGFAHVGVIKL
12 | Neisseria gonorrhoeae FA 1090 | hypothetical protein | YP_208973 | 50 | GGGASKGFAHIGIVKV
13 | Neisseria meningitidis MC58 | conserved hypothetical protein | AAF42447 | 47 | GGGASKGFAHVGIIKV

No. | co | Block II (serine hydrolase motif) | co | Block III (conserved proline) | co | Block IV (active site aspartic acid) | co
1 | 24 | GSSAGGITAALLASGMSPAAF | 85 | VAQAAHISGSFPGVFQK | 15 | EFQDGGVMINVP | 335
2 | 16 | GTSTGALIALGLARGMTPRQI | 103 | AWRVAMRSSAAPTYFPS | 2 | GFADGGVFANNP | 123
3 | 12 | GISIGALNAAIIAGNPPDKRA | 120 | RAEHFMASGALPPGFPA | 6 | HYWDGGMVSNTP | 160
4 | 18 | GSSAGALTGYILCMGGTTTAL | 209 | LLSAAVASGSYPLLFSP | 12 | IYTDGGCLSNLP | 163
5 | 12 | GTSAGALIGGIYANGLPLDEM | 100 | LARALRASMAVPGVFDL | 6 | LLVDGAIARNVP | 516
6 | 13 | GTSAGAMNAVAVAHGLCLGGP | 102 | CADALLASACLPSLQQA | 6 | PYWDGGFAAN-P | 142
7 | 17 | GTSAGAINAVALASGAGNYQL | 117 | GLDHLMATAAIPLIFPS | 6 | FYCDGAVRQLSP | 162
8 | 12 | GTSAGSVVGSLYASGLNGMQL | 67 | TGQAVRASASIPNVFLP | 6 | SYVDGGLVSPVP | 102
9 | 13 | GTSAGAQTLSAYACGQHGYSR | 75 | WLASLKASSAIPLLYRN | 7 | RYVDGGVRDAIP | 126
10 | 12 | GASAGALIGGFYAAGRPVAEL | 100 | LATALRASMAVPGLFDP | 6 | LLVDGGLARQLP | 521
11 | 12 | GTSAGSFVGALYAAGYSGFQL | 67 | TGQAVRASCSIPNIFQP | 6 | RYVDGGLSEPVP | 112
12 | 12 | GTSAGSIVGSLLASGMSPDRL | 67 | AGQAVRASAAIPNVFQP | 6 | KYVDGGLSQPVP | 89
13 | 12 | GTSAGSIVGSLFASGMSPDRL | 67 | AGQAVRASAAIPNVFQP | 6 | TYVDGGLSQPVP | 89

Figure 13 Multiple alignments of the four conserved domains in the putative patatin-like proteins in the genomes of L.
hongkongensis, C. violaceum, N. meningitidis and N. gonorrhoeae. The two arrows indicate the Ser-Asp catalytic dyad. Conserved amino
acids in the four domains are in bold. co, number of amino acids before and after the conserved domains.



does not confer resistance, chromosomal mutations in PBPs may render
the bacteria resistant to β-lactams [72-75].

Apart from the ampC gene (LHK_03028) that encodes the previously
characterized class C β-lactamase [76], there are two other putative
β-lactamases (LHK_00876 and LHK_00878) in the L. hongkongensis
genome. Both are putative metallo-β-lactamases containing a
metallo-β-lactamase superfamily domain with two zinc ligand-binding
sites essential for hydrolysis of the β-lactam ring (Figure 15)
[77-79]. However, these zinc ligand-binding sites are also present in
most proteins of the metallo-β-lactamase superfamily, the function of
which is not limited to β-lactam hydrolysis [79-81]. Therefore, in
vitro experiments are required to confirm the actual function of
these two putative metallo-β-lactamases.
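
The first aligned block in Figure 15 carries the His-X-His-X-Asp-His signature that is shared across the metallo-β-lactamase superfamily. As an illustrative sketch, the snippet below searches for that signature in the six segments transcribed from Figure 15; the segment start positions are taken from the figure, and the helper name is hypothetical.

```python
import re

# H-x-H-x-D-H: the first two histidines and the Asp-His pair contribute
# to the two zinc ligand-binding sites.
HXHXDH = re.compile(r"H.H.DH")

# (segment start position, segment) as printed in the first block of Figure 15.
SEGMENTS = {
    "LHK_00876": (86, "LTHGHLDHVG"),
    "LHK_00878": (100, "LTHHHADHTG"),
    "L1": (103, "LSHAHADHAG"),
    "FEZ-1": (88, "ISHAHFDHAA"),
    "CAU-1": (94, "NTHAHYDHAG"),
    "BJP-1": (99, "NTHAHLDHTG"),
}


def first_histidine_position(start: int, segment: str):
    """Return the sequence position of the first His of the motif, or None."""
    hit = HXHXDH.search(segment)
    return None if hit is None else start + hit.start()


for name, (start, segment) in SEGMENTS.items():
    print(name, first_histidine_position(start, segment))
```

As the paragraph above notes, the presence of this signature does not by itself establish β-lactamase activity, since it occurs in superfamily members with other functions.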

Multidrug resistance genes

A total of 54 CDSs related to multidrug efflux were identified in the
L. hongkongensis genome (Table 3). The five major families of drug
extrusion translocases were all present, including the Major
Facilitator Superfamily (MFS) (7 CDSs), Small Multidrug Resistance
(SMR) family (2 CDSs), RND family (7 CDSs), Multidrug and Toxic
compound Extrusion (MATE) family (2 CDSs), and ATP-Binding Cassette
(ABC) superfamily (6 CDSs).
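
The per-family counts quoted above can be reproduced from the Remarks column of Table 3. The sketch below is a bookkeeping check only: the list is transcribed from the table in row order, and the label "AbgT" is shorthand introduced here for the single "abgT family protein" entry.

```python
from collections import Counter

# Remarks column of Table 3 in row order (54 CDSs).
REMARKS = (
    "OMP RND MFP TR OMP ABC MFP MATE MFS MarC MFP RND RND OMP "
    "MFS MFP OMP TR MarC SMR SMR OMP RND MFP MFS MarC ABC ABC "
    "MFP RND OMP TR MFS AbgT ABC ABC MFP OMP MarC MATE MFS MarC "
    "OMP RND MFP TR MFP RND OMP ABC MFS MFS MFP OMP"
).split()

TRANSPORTER_FAMILIES = ("MFS", "SMR", "RND", "MATE", "ABC")

counts = Counter(REMARKS)
transporters = {family: counts[family] for family in TRANSPORTER_FAMILIES}
accessory = sum(counts.values()) - sum(transporters.values())

print(transporters)
print("accessory or other CDSs:", accessory)
```

The five transporter families account for 24 of the 54 CDSs; the remaining 30 are membrane fusion proteins, outer membrane proteins, transcription regulators, MarC-like proteins and the AbgT family protein.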
Resistance-Nodulation-cell Division (RND) family proteins
For Gram-negative bacteria, the efflux pumps associated with most
clinically significant resistance to antibiotics are those of the RND
family. In this family, three gene loci homologous to acrRAB-tolC

Figure 14 Phylogenetic analysis of outer membrane phospholipase A of L. hongkongensis. The tree was constructed by the neighbor joining
method using Kimura's correction and bootstrap values calculated from 1000 trees. Three hundred and seventy-seven amino acid positions were
included in the analysis. The scale bar indicates the estimated number of substitutions per 10 amino acids. All names and accession numbers are
given as cited in the GenBank database.



(LHK_00138, LHK_00140-00142; LHK_02129-02132; LHK_02825-02828) and
one gene locus homologous to acrAD-tolC (LHK_02929-02931) of
Escherichia coli were identified in the genome of L. hongkongensis.
These three AcrRAB-TolC systems and the AcrAD-TolC multidrug efflux
system share the typical tripartite structure of other multidrug
efflux systems in the RND family [82]. AcrB and AcrD are membrane
transporter proteins, AcrA is a membrane fusion protein and TolC is
an outer membrane channel protein. acrR is a



Table 2 CDSs related to β-lactam antibiotics in L. hongkongensis

CDS | Gene | Product | Organism with the closest matching sequences | E-value | Identities | Remarks a
LHK_00876 | - | β-lactamase domain protein | Thauera sp. | 6e-77 | 135/204 (66%) | -
LHK_00878 | gloB | Hydroxyacylglutathione hydrolase | Rickettsiella grylli | 6e-64 | 126/259 (48%) | -
LHK_00975 | dacC | D-alanyl-D-alanine carboxypeptidase | C. violaceum | e-140 | 254/379 (67%) | PBP6a
LHK_02726 | pbpG | D-alanyl-D-alanine endopeptidase | C. violaceum | 4e-94 | 183/288 (63%) | PBP7
LHK_02764 | prc | Carboxy-terminal processing protease | C. violaceum | 1e-173 | 315/480 (65%) | PBP3 processing protease
LHK_02836 | dacB | Serine-type D-Ala-D-Ala carboxypeptidase | C. violaceum | 3e-81 | 207/427 (48%) | PBP4a
LHK_02959 | mrcA | Peptidoglycan glycosyltransferase | C. violaceum | 0 | 512/795 (64%) | PBP1a
LHK_03028 | ampC | β-lactamase | C. violaceum | 7e-91 | 189/381 (49%) | -
LHK_03062 | ftsI | Penicillin-binding protein 3 precursor | C. violaceum | 0 | 349/586 (59%) | PBP3
LHK_03073 | mrdA | Penicillin-binding protein 2 | C. violaceum | 0 | 404/583 (69%) | PBP2

a PBP, penicillin-binding protein

Table 3 CDSs related to multidrug resistance in L. hongkongensis



CDS | Gene | Product | Organism with the closest matching sequence | E-value | Identities | Remarks b
LHK_00138 | tolC | TolC family type I secretion outer membrane protein | Polaromonas naphthalenivorans | 2e-72 | 203/455 (44%) | OMP
LHK_00140 | acrB | Acriflavin resistance protein | P. naphthalenivorans | 0 | 723/1066 (67%) | RND
LHK_00141 | acrA | Efflux transporter, RND family, MFP subunit | P. naphthalenivorans | 2e-51 | 153/355 (43%) | MFP
LHK_00142 | arsR | Transcription regulator ArsR | Bordetella parapertussis | 2e-26 | 62/99 (62%) | TR
LHK_00221 | - | RND efflux system outer membrane lipoprotein | Pelobacter propionicus | 1e-108 | 207/424 (48%) | OMP
LHK_00222 | macB | Macrolide-specific ABC-type efflux carrier | Bordetella avium | 0 | 429/655 (65%) | ABC
LHK_00223 | macA | Efflux transporter, RND family, MFP subunit | Lutiella nitroferrum | 4e-127 | 252/384 (65%) | MFP
LHK_00466 | - | Hypothetical protein | Dorea longicatena | 6e-45 | 131/439 (29%) | MATE
LHK_00743 | mdfA | Probable multidrug translocase protein | C. violaceum | e-139 | 253/394 (64%) | MFS
LHK_01214 | - | Probable multiple antibiotic resistance protein MarC | C. violaceum | 7e-58 | 118/208 (56%) | MarC
LHK_01285 | mdtA | Probable membrane protein | C. violaceum | 3e-94 | 188/340 (55%) | MFP
LHK_01286_01287 | mdtB | Drug efflux pump transmembrane protein | C. violaceum | 0 | 700/1018 | RND
LHK_01288 | mdtC | Drug efflux pump transmembrane protein | C. violaceum | 0 | 678/994 | RND
LHK_01289 | tolC | Putative outer membrane protein precursor | Acinetobacter sp. | 4e-79 | 189/433 (43%) | OMP
LHK_01373 | emrB | Multidrug resistance protein | C. violaceum | 0 | 323/490 (65%) | MFS
LHK_01374 | emrA | Multidrug efflux membrane fusion protein | Ralstonia eutropha | e-101 | 194/370 (52%) | MFP
LHK_01375 | mdtP | Outer membrane efflux protein | Yersinia enterocolitica | 1e-38 | 142/469 (30%) | OMP
LHK_01376 | emrR | MarR family transcriptional regulator | C. violaceum | 4e-32 | 69/156 (44%) | TR
LHK_01383 | - | Probable multiple antibiotic resistance protein MarC | C. violaceum | 4e-80 | 149/232 (64%) | MarC
LHK_01384 | mdtJ | Multidrug efflux system protein MdtJ | Klebsiella pneumoniae | 3e-20 | 52/119 (43%) | SMR
LHK_01385 | mdtI | Multidrug efflux system protein MdtI | Salmonella enterica | 6e-21 | 63/109 (57%) | SMR
LHK_01424 | - | RND efflux system, outer membrane lipoprotein, NodT family | Syntrophobacter fumaroxidans | 7e-108 | 223/446 (50%) | OMP
LHK_01425 | - | Transporter, hydrophobe/amphiphile efflux-1 (HAE1) family | Pelobacter propionicus | 0 | 553/1036 (53%) | RND
LHK_01426 | - | Efflux transporter, RND family, MFP subunit | S. fumaroxidans | 7e-95 | 186/364 (51%) | MFP
LHK_01870 | - | Putative multidrug resistance protein | R. eutropha | 8e-47 | 153/483 (31%) | MFS
LHK_01934 | - | Probable multiple antibiotic resistance protein MarC | C. violaceum | 3e-80 | 152/205 (74%) | MarC
LHK_01967 | - | ABC transporter, transmembrane region: ABC transporter related | R. eutropha | 0 | 550/732 (75%) | ABC
LHK_02051 | - | Lipoprotein releasing system, ATP-binding protein | Pseudomonas stutzeri | 4e-58 | 134/227 (59%) | ABC

No. of TMS a for the rows above (values in table order; the rows they belong to are not recoverable from this copy): 13, 12, 12, 12, 10, 14, 12, 14

Table 3 CDSs related to multidrug resistance in L. hongkongensis (Continued)



CDS | Gene | Product | Organism with the closest matching sequence | E-value | Identities | No. of TMS a | Remarks b
LHK_02129 | mexA | Multidrug resistance protein | Xanthomonas campestris | 3e-104 | 223/375 (59%) | - | MFP
LHK_02130 | acrB | AcrB/AcrD/AcrF family protein | Cellvibrio japonicus | 0 | 768/1034 (74%) | 14 | RND
LHK_02131 | nodT | RND efflux system, outer membrane lipoprotein, NodT | Geobacter metallireducens | 8e-142 | 266/466 (57%) | - | OMP
LHK_02132 | - | Transcriptional regulator, TetR/AcrR family | Cellvibrio japonicus | 2e-46 | 93/187 (49%) | - | TR
LHK_02173 | - | Probable MFS transporter | C. violaceum | 1e-82 | 195/370 (52%) | 12 | MFS
LHK_02235 | - | Putative integral membrane efflux protein | Yersinia pestis | 0 | 379/505 (75%) | 13 | abgT family protein
LHK_02238 | - | ABC transporter | Azoarcus sp. | e-157 | 281/371 (75%) | 7 | ABC
LHK_02239 | yhiH | ABC transporter related | Thauera sp. | 0 | 699/954 (73%) | 6 | ABC
LHK_02240 | - | Conserved hypothetical protein, predicted secretion protein HlyD family | Azoarcus sp. | 6e-96 | 232/339 (68%) | - | MFP
LHK_02241 | oprM3 | Outer membrane efflux protein | B. avium | 1e-100 | 241/453 (53%) | - | OMP
LHK_02292 | - | Probable multiple antibiotic resistance protein MarC | C. violaceum | 2e-60 | 116/218 (53%) | - | MarC
LHK_02533 | - | Multidrug efflux protein NorA | C. violaceum | e-122 | 230/447 (51%) | 12 | MATE
LHK_02539 | - | EmrB/QacA family drug resistance transporter | P. stutzeri | 2e-147 | 277/481 (57%) | 13 | MFS
LHK_02783 | - | Hypothetical protein, multiple antibiotic resistance (MarC)-related protein | C. violaceum | 5e-74 | 139/200 (69%) | - | MarC
LHK_02825 | notC | Periplasmic type I secretion system | C. violaceum | 4e-82 | 190/439 (43%) | - | OMP
LHK_02826 | acrB | Probable transmembrane drug efflux protein | C. violaceum | 0 | 693/1019 (68%) | 12 | RND
LHK_02827 | acrA | Probable transport/efflux transmembrane protein | C. violaceum | 3e-83 | 174/351 (49%) | - | MFP
LHK_02828 | acrR | TetR/AcrR family transcriptional regulator | C. violaceum | 6e-44 | 92/183 (50%) | - | TR
LHK_02929 | acrA | Probable multidrug efflux membrane permease | C. violaceum | 1e-89 | 203/372 (54%) | - | MFP
LHK_02930 | acrD | Acriflavin resistance protein D | C. violaceum | 0 | 717/1036 (69%) | 12 | RND
LHK_02931 | oprM | Outer membrane efflux protein | C. violaceum | e-136 | 252/467 (53%) | - | OMP
LHK_02949 | msbA | Transport ATP-binding protein MsbA | C. violaceum | 0 | 344/554 (62%) | 5 | ABC
LHK_02975 | bcr | Probable MFS transporter | C. violaceum | e-147 | 269/388 (69%) | 12 | MFS
LHK_03132 | emrB | Probable multidrug resistance protein | C. violaceum | 0 | 303/492 (61%) | 14 | MFS
LHK_03133 | emrA | Multidrug resistance protein | Burkholderia thailandensis | 1e-108 | 202/377 (53%) | - | MFP
LHK_03134 | tolC | Outer membrane efflux protein | R. eutropha | 6e-45 | 153/453 (33%) | - | OMP

a TMS, transmembrane segment domain

b RND, resistance-nodulation-division family; MFS, major facilitator superfamily; ABC, ATP-binding cassette transporter superfamily; SMR, small multidrug resistance
family; MATE, multidrug and toxic compound extrusion; MFP, membrane fusion protein; OMP, outer membrane (channel) protein; TR, transcription regulator;
MarC, MarC-like protein.

Table 4 Miscellaneous resistance genes in L. hongkongensis



CDS | Gene | Product | Organism with the closest matching sequence | E-value | Identities
LHK_00025 | ksgA | Dimethyladenosine transferase | C. violaceum | 1e-94 | 178/260 (68%)
LHK_00913 | - | Arsenical-resistance protein | Burkholderia oklahomensis | 2e-65 | 153/183 (83%)
LHK_01038 | crcB | Camphor resistance protein CrcB | Brucella abortus | 2e-33 | 88/129 (68%)
LHK_01039 | crcB | Camphor resistance protein CrcB | Y. pestis | 3e-34 | 78/123 (63%)
LHK_01350 | rarD | RarD protein, chloramphenicol sensitive | C. violaceum | 2e-81 | 172/285 (60%)
LHK_02940 | bacA | Undecaprenol kinase, putative bacitracin resistance protein | Burkholderia graminis | 3e-83 | 174/278 (62%)



transcription regulator gene located upstream of the acrAB-tolC loci.
As a multidrug efflux system with a broad substrate spectrum,
AcrAB-TolC confers resistance to chloramphenicol, tetracyclines,
erythromycin, trimethoprim, β-lactams, and other organic and
inorganic antiseptic agents in E. coli [83,84]. The AcrAD-TolC
system, in which AcrD is a close homolog of AcrB, is less commonly
reported than AcrAB-TolC. In E. coli, the AcrAD-TolC multidrug efflux
system is capable of exporting antibiotics of the aminoglycoside
class, including amikacin, gentamicin, neomycin, kanamycin,
tobramycin and streptomycin [85,86]. Another putative multidrug
efflux system of the RND family identified in the genome of
L. hongkongensis is homologous to the MdtABC-TolC system (LHK_01285,
LHK_01286, LHK_01288, LHK_01289). The MdtABC-TolC system in E. coli
confers resistance to at least novobiocin and bile salts. A unique
feature of this system is that MdtB and MdtC form a heterodimer that
acts as the membrane efflux component in cooperation with the
membrane fusion protein MdtA and the outer membrane channel protein
TolC [27,87]. Moreover, one RND family multidrug efflux system with
homology to the hydrophobe/amphiphile efflux-1 subfamily was also
discovered (LHK_01424-01426).
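
Loci in this section are written as compact ranges such as LHK_02129-02132. When cross-checking them against Table 3, it can help to expand each range into individual locus tags. The helper below is a hypothetical convenience sketch; it assumes that the tags in a range are numbered consecutively, which holds for the ranges quoted here but is not guaranteed in general.

```python
import re


def expand_locus_range(spec: str) -> list[str]:
    """Expand 'LHK_02129-02132' into its individual locus tags.

    A single tag such as 'LHK_00138' is returned as a one-element list.
    Zero padding of the numeric part is preserved.
    """
    match = re.fullmatch(r"([A-Za-z]+_)(\d+)(?:-(\d+))?", spec.strip())
    if match is None:
        raise ValueError(f"unrecognized locus specification: {spec!r}")
    prefix, first, last = match.groups()
    last = last or first
    if int(last) < int(first):
        raise ValueError(f"range runs backwards: {spec!r}")
    width = len(first)
    return [f"{prefix}{number:0{width}d}" for number in range(int(first), int(last) + 1)]


# The acrAD-tolC homologous locus and one acrRAB-tolC homologous locus.
print(expand_locus_range("LHK_02929-02931"))
print(expand_locus_range("LHK_02825-02828"))
```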



Major Facilitator Superfamily (MFS)

Two loci (LHK_01373-01376; LHK_03132-03134) homologous to the
emrAB-tolC system of E. coli, which belongs to the MFS, were found in
the genome of L. hongkongensis. One of them had an additional
transcription regulator gene, emrR (LHK_01376), in its upstream
sequence. The EmrAB-TolC system in E. coli confers resistance to
nalidixic acid and other toxic substances such as novobiocin [88].
Moreover, mutation of the emrR gene has been shown to lead to
over-expression of the EmrAB pump and increased resistance to
antimicrobial agents [89]. However, the substrate specificity of the
EmrAB-TolC homologs identified in the genome of L. hongkongensis is
yet to be investigated. There are five other multidrug efflux
proteins belonging to the MFS (LHK_00743; LHK_01870; LHK_02173;
LHK_02539; LHK_02975) in the L. hongkongensis genome. One of them
(LHK_00743) is a homolog of the mdfA gene while another (LHK_02975)
has high identity to the bcr gene. mdfA encodes an MF-related
protein, MdfA, which results in resistance to a diverse group of
cationic and zwitterionic lipophilic compounds and antibiotics such
as chloramphenicol and erythromycin when over-expressed in E. coli
[90]. The bcr gene codes for an efflux protein that is associated
with bicyclomycin resistance in E. coli [91].



LHK_00876  (86) LTHGHLDHVG (95) ... (160) HCPGHTPGHV (169) ... (220) FIPGHGPEST (229)
LHK_00878 (100) LTHHHADHTG (109) ... (155) ATPGHTLDHL (164) ... (210) IACAHEYTEA (219)
L1        (103) LSHAHADHAG (112) ... (177) FMAGHTPGST (186) ... (242) LLTPHPGASN (251)
FEZ-1      (88) ISHAHFDHAA (97) ... (164) LTPGHTRGCT (173) ... (230) FLGSHAGMFD (239)
CAU-1      (94) NTHAHYDHAG (103) ... (168) LTPGHTIGCT (177) ... (233) MLPAHEEQGN (242)
BJP-1      (99) NTHAHLDHTG (108) ... (173) ATPGHSPGCT (182) ... (238) LLGPHPEVYG (247)

Figure 15 Multiple alignment of the partial amino acid sequences of the two putative metallo-β-lactamases in L. hongkongensis and
those of known metallo-β-lactamases showing the conserved zinc-ligand binding sites. Amino acid residues highlighted in yellow and
blue represent two independent putative zinc-ligand binding sites of class B3 metallo-β-lactamase, His116-His118-His196 and Asp120-His121-
His263. Numbers in parentheses indicate the corresponding positions in the amino acid sequences. L1, Stenotrophomonas maltophilia IID1275
(accession no. CAA52968); FEZ-1, Legionella gormanii ATCC33297 (accession no. CAB96921); CAU-1, Caulobacter vibrioides DSM 4727 (accession no.
CAC87665); BJP-1, Bradyrhizobium japonicum USDA 110 (accession no. NP_772870)

Small Multidrug Resistance (SMR) family

Two adjacently located multidrug efflux genes (LHK_01384 and
LHK_01385) of the SMR family were identified in the genome of
L. hongkongensis. They are homologous to the mdtJI (also named ydgEF)
genes in E. coli, which confer resistance to spermidine and, at a low
level, to deoxycholate and sodium dodecyl sulfate [92,93]. mdtJ and
mdtI have to be co-expressed for functionality, and it has been
suggested that MdtJI may function as a heterodimer or heterooligomer
[92-94].

Multidrug and Toxic compound Extrusion (MATE) family

Two multidrug efflux genes of the MATE family (LHK_00466 and
LHK_02533) were also discovered in the genome of L. hongkongensis.
One of them (LHK_02533) is a homolog of the multidrug efflux protein
NorA from Staphylococcus aureus, which confers resistance to
antibiotics of the quinolone class and various organic compounds
[95,96]. Mutation of the norA gene in S. aureus has resulted in a 5-
to 30-fold increase in susceptibility to norfloxacin [96].
ATP-Binding Cassette (ABC) superfamily
Six CDSs of the ABC transporter family related to multidrug
resistance were identified in the L. hongkongensis genome. A
tripartite multidrug efflux system of the ABC transporter family,
composed of a membrane transporter (LHK_02239), an MFP (LHK_02240)
and an OMP (LHK_02241), was identified. This system of proteins
probably functions as a complex with a composition resembling that of
the RND family systems. Five other standalone putative ABC
transporter genes (LHK_00222; LHK_01967; LHK_02051; LHK_02238;
LHK_02949) coding for multidrug efflux proteins are scattered over
the L. hongkongensis genome. One of them (LHK_02949) possesses
homology to msbA from E. coli, which is responsible for mediating the
transport of the lipid A core of LPS to the outer membrane [97,98].
Interestingly, expression of E. coli MsbA in Lactococcus lactis,
which lacks LPS, has been shown to significantly increase resistance
to erythromycin [98].

In addition to these five major families, the L. hongkongensis genome
also encodes a number of other possible multidrug resistance-related
genes. Among these, there are five marC-like genes (LHK_01214;
LHK_01383; LHK_01934; LHK_02292; LHK_02783), the expression of which
was once believed to be associated with the multidrug efflux system
MarRAB in E. coli [99]. However, a recent report has shown that
mutation in marC did not increase antibiotic susceptibility in
E. coli [100]. Therefore, the actual function of MarC has not yet
been identified. One CDS (LHK_02235) coding for a protein with 75%
amino acid identity to a putative integral membrane efflux protein of
Yersinia pestis and possessing an AbgT family domain was also
identified in the genome of L. hongkongensis. The AbgT protein family
includes two transporter members, the AbgT protein of E. coli and
MtrF of N. gonorrhoeae [101,102]. MtrF is an inner membrane protein
that enhances the activity of the multidrug efflux system MtrCDE of
the RND family, conferring a higher level of resistance to
hydrophobic antibiotics such as penicillin and erythromycin
[102,103]. Since no mtrCDE gene homologs were found in the genome of
L. hongkongensis, the role and function of the AbgT family protein in
L. hongkongensis remain to be elucidated.
Miscellaneous resistance genes

Six other CDSs with homologies to other drug resistance genes were
identified in the L. hongkongensis genome (Table 4). A putative
dimethyladenosine transferase, encoded by the ksgA gene (LHK_00025),
was found. Kasugamycin and streptomycin resistance as a result of
mutations in ksgA has been documented [104-106]. A bacA gene
(LHK_02940) encoding the putative bacitracin resistance protein BacA
was also identified. BacA confers bacitracin resistance to E. coli by
catalyzing the dephosphorylation of undecaprenyl diphosphate (C55-PP)
into C55-P, which is important in peptidoglycan synthesis. The
conversion of C55-PP into C55-P is normally catalyzed by a specific
phosphatase that is inhibited by bacitracin, which halts
peptidoglycan synthesis [107]. The other four CDSs encode a putative
arsenical-resistance protein (LHK_00913), two camphor resistance
proteins CrcB (LHK_01038 and LHK_01039), and the chloramphenicol
sensitive protein RarD (LHK_01350). Overexpression of CrcB in E. coli
has been shown to protect the bacteria against chromosome
decondensation by camphor [108]. The presence of two crcB genes in
the L. hongkongensis genome, but only one copy in the closely related
bacterium C. violaceum and none in the N. gonorrhoeae or
N. meningitidis genomes, suggests that this is an important defense
mechanism in L. hongkongensis. Consistent with the tetracycline
susceptibility (MIC = 0.5 μg/ml) of strain HLHK9, which was used for
genome sequencing, the tetA gene previously identified in
tetracycline-resistant L. hongkongensis strains is not found in the
present genome [109]. Recently, class 1 integrons carrying multiple
antimicrobial resistance genes were identified in 6.5% of
L. hongkongensis isolates from aquatic products in Guangzhou city,
China [110]. However, no such integron is present in the genome of
strain HLHK9.

Bacteriophages 

The L. hongkongensis genome (genome size 3.16 Mbp) contains a total of eight putative
prophages, named LhP1 to LhP8, the positions of which are shown in Figure 16 and Table 5.
This high number of prophages, compared to 3 prophages in C. violaceum (genome size 4.75
Mbp) (GenBank accession no. AE016825), 1 to 3 in N. meningitidis (genome size 2.14 to 2.27
Mbp) (GenBank accession no. CP000381, FM999788, AM421808, AE002098, AL157959, AM889136,
CP001561) and 6 in N. gonorrhoeae (genome size 2.15 to 2.23 Mbp) (GenBank accession no.
AE004969, CP001050) using the same parameters for prophage prediction by Prophage Finder,
suggests that this is an important mechanism for acquisition and exchange of genetic
material in L. hongkongensis. While N. meningitidis and C. violaceum mainly cause
meningitis and invasive infections, respectively, that can lead to fatal septicemia,
N. gonorrhoeae and L. hongkongensis are mainly isolated from the human genital and
gastrointestinal tracts, respectively. Interestingly, the apparently high number of
prophages in N. gonorrhoeae as well is in line with our previous observation that
horizontal gene transfer was particularly frequent among bacteria residing in the human
gastrointestinal and probably genital tracts [111], suggesting that these anatomical sites
may be excellent incubators for bacterial gene transfer.

Figure 16 Position of the LhP prophages and the CDSs coding for transposases in the
L. hongkongensis genome. LhP1 to LhP8: L. hongkongensis prophages 1 to 8.

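
Because the four species differ in genome size, the prophage counts quoted in the text can also be compared per megabase. The following is a minimal sketch of that normalization, using only the counts and genome sizes given in this section; the dictionary layout and the function name are illustrative choices, not part of the original analysis.

```python
# Prophage counts and genome sizes (Mbp) as quoted in the text.
# Values reported as ranges are stored as (low, high) tuples.
genomes = {
    "L. hongkongensis": {"prophages": (8, 8), "size_mbp": (3.16, 3.16)},
    "C. violaceum": {"prophages": (3, 3), "size_mbp": (4.75, 4.75)},
    "N. meningitidis": {"prophages": (1, 3), "size_mbp": (2.14, 2.27)},
    "N. gonorrhoeae": {"prophages": (6, 6), "size_mbp": (2.15, 2.23)},
}


def density_range(entry):
    """Return the (lowest, highest) possible prophages-per-Mbp for one genome."""
    p_lo, p_hi = entry["prophages"]
    s_lo, s_hi = entry["size_mbp"]
    # Lowest density: fewest prophages in the largest genome, and vice versa.
    return p_lo / s_hi, p_hi / s_lo


densities = {name: density_range(entry) for name, entry in genomes.items()}

if __name__ == "__main__":
    for name, (lo, hi) in densities.items():
        print(f"{name}: {lo:.2f} to {hi:.2f} prophages per Mbp")
```

On this measure, L. hongkongensis and N. gonorrhoeae both carry clearly more prophages per megabase than C. violaceum or N. meningitidis, which matches the contrast drawn in the text.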
LhP1

Bacteriophage LhP1 is composed of 47 CDSs, accounting for 31,318 bp with G+C content
63.07%, close to the G+C content of the L. hongkongensis genome. LhP1 contains 34
phage-related CDSs. Analysis of these CDSs indicated that LhP1 is likely a P2-like phage,
as 29 of its 34 phage-related CDSs were most similar to CDSs in P2-like prophages (Figure
17). A P2-like phage typically possesses an icosahedral head with a diameter of about 60
nm, containing a linear double-stranded DNA molecule of about 30-35 kb with cohesive ends,
and a straight tail with a contractile sheath [112]. Based on their morphology, P2-like
phages are classified as members of the Myoviridae family (phages with contractile tails)
in the order Caudovirales (tailed phages) [113]. Other CDSs exhibit similarity to other
genes of phages such as Mu-like phages and unclassified phages under Myoviridae and
Siphoviridae (phages with long non-contractile tails).
LhP2

Bacteriophage LhP2 is composed of 32 CDSs, accounting for 26,141 bp with G+C content
64.81%. Analysis of its CDSs indicated that LhP2 is likely a Mu-like phage, with 10 of the
25 phage-related CDSs most similar to CDSs in Mu-like phages of C. violaceum (CvP1),
Haemophilus influenzae and N. meningitidis. There are also other CDSs similar to other
phage genes of lambda- and P2-like phages.

Figure 17 Dot-plot analysis for LhP1, LhP8 and E. coli phage P2. (A) Dot-plot alignment of
LhP8 sequences (vertical axis) versus LhP1 sequences (horizontal axis). (B) Dot-plot
alignment of LhP1 sequences (vertical axis) versus Enterobacteria phage P2 sequences
(horizontal axis). (C) Dot-plot alignment of LhP8 sequences (vertical axis) versus
Enterobacteria phage P2 sequences (horizontal axis).

Table 5 Prophages of L. hongkongensis HLHK9

| Prophage | Nucleotide positions | Size (bp) | GC (%) | No. of CDS | Total no. of phage-related CDS | Phage-related CDS (no.) |
|---|---|---|---|---|---|---|
| LhP1 | 356028-387345 | 31,318 | 63.07 | 47 | 34 | P2-like (29), unclassified Myoviridae (2), unclassified Siphoviridae (2), Mu-like (1) |
| LhP2 | 1109928-1136068 | 26,141 | 64.81 | 32 | 25 | Mu-like (10), P2-like (5), lambda-like (3), unclassified phage (3), epsilon15-like (1), unclassified Myoviridae (1), unclassified Podoviridae (1), unclassified Siphoviridae (1) |
| LhP3 | 1454673-1465841 | 11,169 | 58.70 | 19 | 14 | BPP-1-like (3), lambda-like (3), epsilon15-like (2), unclassified Podoviridae (2), P1-like viruses (1), unclassified Myoviridae (1), unclassified Siphoviridae (1), unclassified phage (1) |
| LhP4 | 1477589-1511963 | 34,375 | 58.78 | 36 | 23 | BPP-1-like (14), P4-like (4), P2-like (1), P22-like (1), epsilon15-like (1), unclassified Siphoviridae (1), unclassified Myoviridae (1) |
| LhP5 | 1568789-1612785 | 43,997 | 59.00 | 64 | 32 | Mu-like (9), lambda-like (7), unclassified Podoviridae (5), unclassified phage (4), unclassified Myoviridae (3), unclassified Siphoviridae (2), P2-like (1), P22-like (1) |
| LhP6 | 1671244-1693161 | 21,918 | 62.04 | 31 | 25 | unclassified Siphoviridae (12), lambda-like (5), unclassified phage (3), T1-like (2), unclassified Myoviridae (2), unclassified Podoviridae (1) |
| LhP7 | 1888197-1908188 | 19,992 | 55.59 | 31 | 18 | unclassified phage (7), Mu-like (4), unclassified Myoviridae (2), lambda-like (1), P22-like (1), unclassified Podoviridae (1), unclassified Siphoviridae (1), BPP-1-like (1) |
| LhP8 | 2462791-2496581 | 33,791 | 63.87 | 48 | 37 | P2-like (30), Myoviridae (2), unclassified Siphoviridae (2), unclassified phage (2), Mu-like (1) |

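
The figures reported in Table 5 are internally checkable: each prophage size should equal the span of its nucleotide positions, and the per-category counts should add up to the total number of phage-related CDSs. The sketch below performs both checks on values transcribed from Table 5; the tuple layout, the function name and the use of 3.16 Mbp as an approximate genome length are assumptions made for illustration.

```python
# Rows transcribed from Table 5:
# (prophage, start, end, reported size in bp,
#  reported total phage-related CDSs, per-category CDS counts)
TABLE5 = [
    ("LhP1", 356028, 387345, 31318, 34, [29, 2, 2, 1]),
    ("LhP2", 1109928, 1136068, 26141, 25, [10, 5, 3, 3, 1, 1, 1, 1]),
    ("LhP3", 1454673, 1465841, 11169, 14, [3, 3, 2, 2, 1, 1, 1, 1]),
    ("LhP4", 1477589, 1511963, 34375, 23, [14, 4, 1, 1, 1, 1, 1]),
    ("LhP5", 1568789, 1612785, 43997, 32, [9, 7, 5, 4, 3, 2, 1, 1]),
    ("LhP6", 1671244, 1693161, 21918, 25, [12, 5, 3, 2, 2, 1]),
    ("LhP7", 1888197, 1908188, 19992, 18, [7, 4, 2, 1, 1, 1, 1, 1]),
    ("LhP8", 2462791, 2496581, 33791, 37, [30, 2, 2, 2, 1]),
]

# Approximate genome length; the text gives the size only as 3.16 Mbp.
GENOME_BP_APPROX = 3.16e6


def row_is_consistent(name, start, end, size, n_related, category_counts):
    """Check the coordinate span and the category sum for one table row."""
    span = end - start + 1  # inclusive coordinates
    return span == size and sum(category_counts) == n_related


all_consistent = all(row_is_consistent(*row) for row in TABLE5)
total_prophage_bp = sum(row[3] for row in TABLE5)
prophage_fraction = total_prophage_bp / GENOME_BP_APPROX

if __name__ == "__main__":
    print("Table 5 internally consistent:", all_consistent)
    print("Total prophage length (bp):", total_prophage_bp)
    print(f"Approximate share of genome: {prophage_fraction:.1%}")
```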
LhP3 

Bacteriophage LhP3 is the smallest prophage in the L. hongkongensis genome. It is composed
of 19 CDSs, accounting for 11,169 bp with G+C content 58.70%, lower than that of the host
genome (62.35%), reflecting its heterologous origin. Of the 19 CDSs, 14 were phage-related
CDSs with similarity to genes of BPP-1-, lambda- and epsilon15-like phages and other
unclassified phages, indicating its genetic complexity. Further studies are required to
determine whether this relatively small prophage is a functional tailed phage.
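
The argument for heterologous origin rests on how far a prophage's G+C content departs from the host value of 62.35%. A minimal sketch of that comparison follows, using the G+C percentages listed in Table 5; the helper names are illustrative, and `gc_percent` is a generic way to obtain such a value from a nucleotide sequence rather than the authors' own procedure.

```python
HOST_GC = 62.35  # % G+C of the L. hongkongensis genome, as stated in the text

# % G+C of each prophage, transcribed from Table 5.
PROPHAGE_GC = {
    "LhP1": 63.07, "LhP2": 64.81, "LhP3": 58.70, "LhP4": 58.78,
    "LhP5": 59.00, "LhP6": 62.04, "LhP7": 55.59, "LhP8": 63.87,
}


def gc_percent(seq):
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    acgt = gc + seq.count("A") + seq.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / acgt


# Signed difference from the host genome, in percentage points.
deviations = {name: round(gc - HOST_GC, 2) for name, gc in PROPHAGE_GC.items()}
lowest_gc_prophage = min(PROPHAGE_GC, key=PROPHAGE_GC.get)

if __name__ == "__main__":
    for name, delta in sorted(deviations.items(), key=lambda item: item[1]):
        print(f"{name}: {delta:+.2f} percentage points relative to host")
```

Sorting by deviation places LhP7, LhP3, LhP4 and LhP5 furthest below the host value, which are the prophages the text describes as having lower G+C content than the host genome.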
LhP4 

Bacteriophage LhP4 is composed of 36 CDSs, accounting for 34,375 bp with G+C content
58.78%, also lower than that of the host genome, indicating its heterologous origin. Of the
23 phage-related CDSs, 14 possessed similarity to genes of Bordetella phage BPP-1. Other
phage-related genes resemble those of P4-, P2-, P22- and epsilon15-like phages and
unclassified phages of Siphoviridae and Myoviridae.
LhP5 

Bacteriophage LhP5 is the largest prophage identified in the L. hongkongensis genome.
Composed of 64 CDSs, it accounts for 43,997 bp with G+C content 59%, lower than that of the
host genome. Of the 32 phage-related CDSs, 9 possessed homologies to genes of Mu-like
phages and 7 to genes of lambda-like phages. The other phage-related CDSs are most closely
related to those of various phages, including those belonging to Podoviridae (phages with
short tails), Myoviridae and Siphoviridae.



LhP6 

Bacteriophage LhP6 is composed of 31 CDSs, account- 
ing for 21,918 bp with G+C content of 62.04%. The 25 
phage-related CDSs exhibit similarity to phage genes of 
Bordetella bronchiseptica and Bordetella avium. Of 
these 25 CDSs, 12 possessed homologies to genes of 
unclassified phages belonging to Siphoviridae and 5 to 
lambda-like phages. 
LhP7 

Bacteriophage LhP7 is composed of 31 CDSs, accounting for 19,992 bp with the lowest G+C
content (55.59%) among the eight prophages, suggesting a heterologous origin. Of the 18
phage-related CDSs, 4 exhibit similarity to genes of Mu-like phages of N. meningitidis,
Burkholderia and C. violaceum, and the others to those of unclassified phages and lambda-,
P22- and BPP-1-like phages.
LhP8 

Similar to LhP1, bacteriophage LhP8 is also a P2-like phage (Figure 17). It is composed of
48 CDSs, accounting for 33,791 bp with G+C content of 63.87%, similar to that of the host
genome. It contains the highest number of phage-related CDSs (n = 37) among the eight
phages. Of the 37 phage-related CDSs, 30 were most similar to genes of P2-like phages and
others to phages of Myoviridae, Siphoviridae and Mu-like phages. In fact, LhP1 and LhP8 are
highly similar with the exception of a few CDSs, with most of their CDSs exhibiting
similarity to phage proteins found in other gram-negative bacteria including Salmonella,
Burkholderia, Yersinia, and Shigella species. Their gene organizations are also highly
similar to that of P2 phage (Table 6) (Figure 17).

Table 6 CDSs of LhP1 and LhP8 from the L. hongkongensis HLHK9, and comparison of genome
structures of LhP1 (reverse complement), LhP8 and E. coli P2 phage.

P2: Q, P, O, N, M, L, X, Y, K, lysA, lysB, lysC, R, S, V, W, J, G, Z/fun, FI, FII, E+E', E,
T, U, D, Ogr

LhP1: LHK_00420, LHK_00419, LHK_00418, LHK_00417, LHK_00416, LHK_00415, LHK_00414,
LHK_00413, LHK_00412, LHK_00411, LHK_00410, LHK_00409, LHK_00408, LHK_00407,
LHK_00406/LHK_00405, LHK_00404, LHK_00403, LHK_00402, LHK_00401, LHK_00400, LHK_00399,
LHK_00398, LHK_00397, LHK_00396, LHK_00395, LHK_00394, LHK_00393, LHK_00392, LHK_00391,
LHK_00390, LHK_00389, LHK_00388, LHK_00387, LHK_00386, LHK_00385, LHK_00384, LHK_00383,
LHK_00382, LHK_00381, LHK_00380

LhP8: LHK_02579, LHK_02580/LHK_02582, LHK_02581/LHK_02583, LHK_02584, LHK_02585, LHK_02586,
LHK_02587, LHK_02588, LHK_02589, LHK_02590, LHK_02591, LHK_02592, LHK_02593, LHK_02594,
LHK_02595, LHK_02596, LHK_02597, LHK_02598, LHK_02599, LHK_02600, LHK_02601, LHK_02602,
LHK_02603, LHK_02604, LHK_02605, LHK_02606, LHK_02607, LHK_02608, LHK_02609, LHK_02610,
LHK_02611, LHK_02612, LHK_02613, LHK_02614, LHK_02615, LHK_02616, LHK_02617, LHK_02618

Function: Capsid portal protein; Large terminase subunit; Capsid scaffold; Major capsid
precursor; Small terminase subunit; Capsid completion; Tail; Lysis - holin; Lysis -
endolysin; Phage-related transmembrane protein; Hypothetical protein; Putative
phage-related protein (hydrolase); Timing of lysis; Timing of lysis; Regulation of lysis;
Hypothetical protein; Tail completion; Tail completion; Baseplate assembly; Baseplate
assembly; Baseplate assembly; Baseplate assembly; Tail fiber; Hypothetical protein; Mu-like
prophage protein Com; DNA adenine methylase; Hypothetical protein; Tail fiber assembly;
Blocks phage T5; Tail sheath; Tail tube; Tail; Tail; Tail; Tail; Tail; Hypothetical
protein; Hypothetical protein; Hypothetical protein; Hypothetical protein; Hypothetical
protein; Anthranilate synthase component I; Hypothetical protein; Hypothetical protein;
Hypothetical protein; Hypothetical protein; Late promoter activator; Hypothetical protein;
Hypothetical protein; Hypothetical protein

Table 6 (Continued)

| P2 | LhP1 | LhP8 | Function |
|---|---|---|---|
| - | - | LHK_02619 | Hypothetical protein |
| - | LHK_00379 | LHK_02620 | Cro/CI family transcriptional regulator |
| - | LHK_00378 | LHK_02621 | Hypothetical protein |
| Int | - | - | Integrase |
| C | - | - | Immunity repressor |
| Cox | - | - | Inhibits integration |
| B | - | - | DNA replication |
| A | LHK_00377 | LHK_02622 | DNA replication |
| - | LHK_00376 | - | Hypothetical protein |
| - | LHK_00375 | - | Hypothetical protein |
| - | - | LHK_02623 | Hypothetical protein |
| - | - | LHK_02624 | DNA binding protein, excisionase family |
| tin | - | - | Blocks growth of T-even phages |
| old | - | - | Blocks growth of phage lambda |
| - | LHK_00374 | LHK_02625 | Integrase |

Remnant phages

Among the eight putative prophages, LhP1 and LhP8 are most likely to represent intact
prophages, while the remaining six prophages encode a diversity of prophage elements of
phage-related structural and non-structural proteins. In addition to these putative
prophages, 17 other phage-related CDSs were found scattered in the L. hongkongensis genome.
However, these CDSs are either not flanked by other phage-related genes or located in
phage-related gene clusters too short for confident prediction as prophages. Further
studies are required to ascertain whether the present putative prophages and phage-related
gene clusters are intact or remnant phages.

Transposases and insertion sequences 

There are 26 CDSs coding for transposases in the L. hongkongensis genome (Table 7).
Fourteen of these 26 transposases possessed homologies to transposases of the IS3 family,
nine to those of the IS5 family and three to those of the IS481 family. The presence of
transposases of the IS481 family is unique to L. hongkongensis, as they are absent in other
members of the Neisseriaceae family such as the pathogenic Neisseria species and
C. violaceum [114]. The transposases of L. hongkongensis are most closely related to those
of other members of β-proteobacteria, especially of the order Burkholderiales, with seven
most closely related to those of Comamonas testosteroni, seven to those of
Janthinobacterium sp., and four to those of Polaromonas sp. However, only two pairs of
these transposases carry short imperfect inverted repeats at their ends that form insertion
sequences most closely related to the IS3 family. The other transposases lack associated
inverted repeats and are likely remnant insertion sequences. The first insertion sequence,
of 1,183 bp, contains two ORFs, LHK_01280 (ORFb) and LHK_01281 (ORFa), with 38-bp inverted
repeats with six mismatches. The second insertion sequence is relatively short, at 603 bp,
containing two ORFs, LHK_02311 and LHK_02312 (ORFa), and 50-bp inverted repeats with ten
mismatches. The G+C contents of both putative insertion sequences (57.4% and 54.89%
respectively) are lower than that of the L. hongkongensis genome, suggestive of
heterologous origin.
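
The description of "38-bp inverted repeats with six mismatches" can be made concrete: the left end of an insertion sequence is compared with the reverse complement of its right end, and the positions that differ are counted. The following is a minimal sketch of that comparison on a made-up toy sequence; the function names and the example element are hypothetical, and the study itself located inverted repeats with einverted (EMBOSS), which this code does not reproduce.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA string containing only A, C, G and T."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_ir_mismatches(element, ir_len):
    """Count mismatches between the two terminal inverted repeats.

    The first ir_len bases are compared, position by position, with the
    reverse complement of the last ir_len bases. Zero means a perfect
    inverted repeat.
    """
    if ir_len <= 0 or 2 * ir_len > len(element):
        raise ValueError("ir_len must be positive and fit twice in the element")
    left = element[:ir_len].upper()
    right_rc = reverse_complement(element[-ir_len:])
    return sum(a != b for a, b in zip(left, right_rc))


# Hypothetical toy element: an 8-bp left repeat, a short spacer standing in
# for the transposase ORFs, and the matching right repeat.
left_repeat = "TGAACCGT"
perfect_element = left_repeat + "AAAAAA" + reverse_complement(left_repeat)
# Same element with its final base changed, giving one mismatch.
imperfect_element = perfect_element[:-1] + "C"

if __name__ == "__main__":
    print(terminal_ir_mismatches(perfect_element, 8))
    print(terminal_ir_mismatches(imperfect_element, 8))
```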

Table 7 Transposases identified in the genome of L. hongkongensis HLHK9

| CDS | IS name | IS family | IS group | Origin | Identity (%) | E-value | Size (bp) |
|---|---|---|---|---|---|---|---|
| LHK_00816 | ISCte3 | IS3 | IS407 | Comamonas testosteroni | 79.76 | 2e-38 | 294 |
| LHK_00817 | ISCte3 | IS3 | IS407 | Comamonas testosteroni | 77.01 | 9e-36 | 264 |
| LHK_00911 | ISAisp3 | IS481 | - | Acidovorax sp. | 62.5 | 7e-54 | 588 |
| LHK_01023 | ISJsp2 | IS5 | IS903 | Janthinobacterium sp. | 47.92 | 6e-50 | 822 |
| LHK_01024 | ISPosp5 | IS3 | IS3 | Polaromonas sp. | 71.21 | 2e-24 | 339 |
| LHK_01025 | ISPosp5 | IS3 | IS3 | Polaromonas sp. | 65.98 | 1e-33 | 336 |
| LHK_01171 | ISPosp5 | IS3 | IS3 | Polaromonas sp. | 71.21 | 2e-24 | 342 |
| LHK_01172 | ISPosp5 | IS3 | IS3 | Polaromonas sp. | 65.98 | 2e-33 | 336 |
| LHK_01280 | ISKpn10 | IS3 | IS407 | Klebsiella pneumoniae | 68.29 | 2e-50 | 360 |
| LHK_01281 | ISKpn10 | IS3 | IS407 | Klebsiella pneumoniae | 84.09 | 2e-39 | 267 |
| LHK_01366 | ISJsp2 | IS5 | IS903 | Janthinobacterium sp. | 60.68 | 4e-36 | 744 |
| LHK_01998 | ISJsp2 | IS5 | IS903 | Janthinobacterium sp. | 75.86 | 2e-09 | 186 |
| LHK_01999 | ISPpa4 | IS5 | IS903 | Paracoccus pantotrophus | 69 | 2e-17 | 219 |
| LHK_02013 | ISRme14 | IS481 | - | Ralstonia metallidurans | 59.46 | 2e-22 | 249 |
| LHK_02014 | ISAisp3 | IS481 | - | Acidovorax sp. | 71.26 | 8e-27 | 276 |
| LHK_02015 | ISJsp2 | IS5 | IS903 | Janthinobacterium sp. | 60.53 | 7e-35 | 540 |
| LHK_02176 | ISJsp2 | IS5 | IS903 | Janthinobacterium sp. | 58.02 | 7e-23 | 273 |
| LHK_02311 | ISCte3 | IS3 | IS407 | Comamonas testosteroni | 88.89 | 6e-09 | 141 |
| LHK_02312 | ISCte3 | IS3 | IS407 | Comamonas testosteroni | 78.05 | 8e-14 | 126 |
| LHK_02314 | ISCte3 | IS3 | IS407 | Comamonas testosteroni | 96 | 3e-20 | 399 |
| LHK_02540 | ISJsp2 | IS5 | IS903 | Janthinobacterium sp. | 59.54 | 9e-40 | 648 |
| LHK_02711 | IS476 | IS3 | IS407 | Xanthomonas campestris pv. vesicatoria 81-23 race 2 | 63.64 | 7e-42 | 387 |
| LHK_02712 | IS1421 | IS5 | IS427 | Ralstonia solanacearum | 57.38 | 8e-33 | 357 |
| LHK_02720 | ISCte3 | IS3 | IS407 | Comamonas testosteroni | 72.73 | 4e-14 | 627 |
| LHK_02721 | ISCte3 | IS3 | IS407 | Comamonas testosteroni | 77.01 | 4e-12 | 264 |
| LHK_03256 | ISJsp2 | IS5 | IS903 | Janthinobacterium sp. | 62.79 | 3e-27 | 477 |

Conclusions

The L. hongkongensis genome possessed genes and gene cassettes for acid and bile
resistance, colonization of the intestinal mucosa, evasion of host defense and cytotoxicity
and invasion. In addition, a broad variety of antibiotic resistance or multidrug resistance
genes, a high number of prophages, together with other phage-related CDSs and CDSs coding
for transposases, were also identified.

Methods

CDSs identified in the L. hongkongensis genome were annotated as described in our previous
publication and classified functionally according to the Clusters of Orthologous Groups
(COG) methodology [10]. CDSs belonging to COG clusters potentially associated with
virulence (such as intracellular trafficking, secretion and vesicular transport) were
selected for further examination, whereas those associated with housekeeping functions
(such as chromatin structure and dynamics) were removed. The CDSs were then examined by
comparison with the latest release of the reference Virulence Factor Database (VFDB) [115]
and keyword searching using the following words and their variants: virulence, toxin,
hemolysin/hemolysis, pathogenicity, adherence, invasion, secretion, phagocytosis, phase
variation, stress, iron uptake, siderophore, resistance, efflux pump, damaging and
regulation. For drug resistance, CDSs that were classified to COG V (defense mechanisms),
COG Q (secondary metabolites biosynthesis, transport and catabolism), and COG M (cell
wall/membrane/envelope biogenesis) were manually annotated for identification of antibiotic
resistance-related genes. CDSs from other COGs were searched for additional genes using the
keywords resistance, antibiotic, efflux, multi, etc. Prophages were identified by Prophage
Finder (http://bioinformatics.uwp.edu/~phage/) searches [116]. The genome was analyzed with
an e-value of 0.01, hits per prophage of 7, and hit spacing of 5000. Transposases were
identified by performing BlastP analyses for all CDSs identified in the genome of
L. hongkongensis HLHK9 against the ISfinder database (http://www-is.biotoul.fr/is.html)
[117], and inverted repeats were identified by einverted (EMBOSS package) [118]. Manual
confirmation of the assigned function was performed by sequence similarity search using
BLAST against the NCBI nr database, assisted by conserved domain search (CD-search),
identification of signature sequence motifs and sequence analysis using InterProScan.
Localization patterns of putative virulence factors were predicted using PSORTb where
appropriate [119].
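
To illustrate how the three prophage-prediction parameters quoted above (e-value of 0.01, hits per prophage of 7, hit spacing of 5000) could interact, the sketch below filters hits by e-value, groups hits that lie within the spacing limit of one another, and keeps only groups with enough hits. This is one plausible reading of those parameters and is not Prophage Finder's actual implementation; the function name, the (position, e-value) input format and the example hits are all hypothetical.

```python
def candidate_regions(hits, max_evalue=0.01, min_hits=7, max_spacing=5000):
    """Group phage-like hits into candidate prophage regions.

    hits is an iterable of (genome_position, evalue) pairs. Hits above
    max_evalue are discarded, consecutive hits more than max_spacing bp apart
    start a new group, and groups with fewer than min_hits hits are dropped.
    Returns a list of (start, end, number_of_hits) tuples.
    """
    positions = sorted(pos for pos, evalue in hits if evalue <= max_evalue)
    regions = []
    current = []
    for pos in positions:
        if current and pos - current[-1] > max_spacing:
            if len(current) >= min_hits:
                regions.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_hits:
        regions.append((current[0], current[-1], len(current)))
    return regions


# Hypothetical input: eight closely spaced significant hits, two isolated
# significant hits, and one hit that fails the e-value cutoff.
example_hits = [(1000 * i, 1e-5) for i in range(1, 9)]
example_hits += [(50000, 1e-5), (52000, 1e-5)]
example_hits += [(4500, 0.5)]

example_regions = candidate_regions(example_hits)

if __name__ == "__main__":
    print(example_regions)
```

Under this reading, a cluster of phage-related CDSs that is too small or too sparse is not reported as a prophage, which corresponds to the 17 scattered phage-related CDSs described in the Remnant phages section.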
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